Scalable computation of kinship and identity coefficients on large pedigrees.

Authors

  • En Cheng
  • Brendan Elliott
  • Z Meral Ozsoyoglu
Abstract

With the rapidly expanding field of medical genetics and genetic counseling, genealogy information is becoming increasingly abundant. An important computation on pedigree data is the calculation of identity coefficients, which provide a complete description of the degree of relatedness of a pair of individuals. The areas of application of identity coefficients are numerous and diverse, from genetic counseling to disease tracking, and thus, the computation of identity coefficients merits special attention. However, the computation of identity coefficients is not done directly, but rather as the final step after computing a set of generalized kinship coefficients. In this paper, we first propose a novel Path-Counting Formula for calculating generalized kinship coefficients, which is motivated by Wright's path-counting method for computing the inbreeding coefficient for an individual. We then present an efficient and scalable scheme for calculating generalized kinship coefficients on large pedigrees using NodeCodes, a special encoding scheme for expediting the evaluation of queries on pedigree graph structures. We also perform experiments for evaluating the efficiency of our method, and compare it with the performance of the traditional recursive algorithm for three individuals. Experimental results demonstrate that the resulting scheme is more scalable and efficient than the traditional recursive methods for computing generalized kinship coefficients.
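For orientation, the "traditional recursive" approach mentioned in the abstract can be illustrated in its simplest form, the classical kinship coefficient for two individuals. The sketch below is not the paper's Path-Counting Formula or its NodeCodes scheme, and it does not cover generalized kinship coefficients for three or more individuals. The pedigree layout, the sample individuals, and the names `make_kinship`, `depth`, and `phi` are assumptions made for illustration.

```python
from functools import lru_cache

def make_kinship(pedigree):
    """Return a memoized kinship function for a pedigree.

    pedigree maps each individual to a (father, mother) tuple;
    unknown parents are None (founders map to (None, None)).
    This layout is a hypothetical one chosen for the example.
    """

    @lru_cache(maxsize=None)
    def depth(x):
        # Founders have depth 0; others sit one level below their deepest parent.
        parents = [p for p in pedigree[x] if p is not None]
        return 1 + max(depth(p) for p in parents) if parents else 0

    @lru_cache(maxsize=None)
    def phi(a, b):
        # An unknown parent is treated as unrelated to everyone.
        if a is None or b is None:
            return 0.0
        if a == b:
            # Self-kinship: 1/2 * (1 + kinship of the two parents).
            father, mother = pedigree[a]
            return 0.5 * (1.0 + phi(father, mother))
        # Recurse on the deeper individual, which cannot be an ancestor
        # of the other one.
        if depth(a) < depth(b):
            a, b = b, a
        father, mother = pedigree[a]
        return 0.5 * (phi(father, b) + phi(mother, b))

    return phi

# Hypothetical pedigree: C is the child of full siblings S1 and S2.
pedigree = {
    "F": (None, None),
    "M": (None, None),
    "S1": ("F", "M"),
    "S2": ("F", "M"),
    "C": ("S1", "S2"),
}
phi = make_kinship(pedigree)
print(phi("S1", "S2"))  # kinship of full siblings: 0.25
print(phi("C", "C"))    # self-kinship of the inbred child: 0.625
```

The inbreeding coefficient of `C` equals the kinship of its parents, here 0.25. Wright's path-counting method gives the same value for this pedigree: two paths, through `F` and through `M`, each contribute (1/2)^3. Memoization keeps this small example cheap, but the number of distinct recursive calls still grows with pedigree size, which is the cost the abstract refers to when it motivates a more scalable scheme.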


Similar resources

Path-Counting Formulas for Generalized Kinship Coefficients and Condensed Identity Coefficients

An important computation on pedigree data is the calculation of condensed identity coefficients, which provide a complete description of the degree of relatedness of two individuals. The applications of condensed identity coefficients range from genetic counseling to disease tracking. Condensed identity coefficients can be computed using linear combinations of generalized kinship coefficients f...


A graphical algorithm for fast computation of identity coefficients and generalized kinship coefficients

Computing the probability of identity by descent sharing among n genes given only the pedigree of those genes is a computationally challenging problem, if n or the pedigree size is large. Here, I present a novel graphical algorithm for efficiently computing all generalized kinship coefficients for n genes. The graphical description transforms the problem from doing many recursion on ...


Approximating identity-by-descent matrices using multiple haplotype configurations on pedigrees.

Identity-by-descent (IBD) matrix calculation is an important step in quantitative trait loci (QTL) analysis using variance component models. To calculate IBD matrices efficiently for large pedigrees with large numbers of loci, an approximation method based on the reconstruction of haplotype configurations for the pedigrees is proposed. The method uses a subset of haplotype configurations with h...


Correcting for Cryptic Relatedness in Genome-Wide Association Studies

While the individuals chosen for a genome-wide association study (GWAS) may not be closely related to each other, there can be distant (cryptic) relationships that confound the evidence of disease association. These cryptic relationships violate the GWAS assumption regarding the independence of the subjects’ genomes, and failure to account for these relationships results in both false positives...


Two-locus and three-locus gene identity by descent in pedigrees.

Although there have been several mathematical formulations of multilocus segregation, multilocus gene identity by descent in pedigrees has been little considered. Here we present a computationally feasible algorithm for the computation of two-locus kinship for individuals between whom there may be multiple complex relationships, and use it to investigate patterns of two-locus gene identity by d...



Journal:
  • Computational systems bioinformatics. Computational Systems Bioinformatics Conference

Volume 7

Publication date: 2008